International Journal of Software Engineering for Smart Device

Volume 13, No. 1, 2019, pp 17-24
http://dx.doi.org/10.21742/ijseia.2019.13.1.03



A Framework for Evaluating Performance of Algorithms Extracting the Main Content from a Web Page



    Minwoo Park, Geunseong Jung, Kwanguk Kim, Hansung Kim, Jaehyuk Cha*
    Department of Computer Science, Hanyang University, Seoul, Korea
    {pmw9027, aninteger, kenny, hsk, chajh*}@hanyang.ac.kr

    Abstract

    Main content extraction is a core element of web mining: it extracts only those areas of a single web page that carry independent information. Evaluating such algorithms raises three problems. First, because web technology changes rapidly, it is hard to keep an experimental environment accurate over time. Second, the comparison scale used for evaluation, which varies depending on the approach of each algorithm, should be unified on the basis of that approach. Third, the evaluation setup must be flexible enough to accommodate algorithms and web environments developed in the future. This work proposes an extensible performance evaluation framework for comparing several main content extraction algorithms. The framework can apply several comparison scales, and it protects experiments against changes to web pages by saving the test targets.
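
    The three properties named in the abstract (saved test targets, several comparison scales, and extensibility) can be illustrated with a minimal sketch. It is not the authors' implementation: every name in it (Snapshot, token_f1, exact_match, evaluate, and the two toy extractors) is hypothetical, and token-level F1 is only one assumed example of a comparison scale.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical plug-in types; none of these names come from the paper.
Extractor = Callable[[str], str]       # saved HTML -> extracted main text
Measure = Callable[[str, str], float]  # (extracted, gold) -> score in [0, 1]


@dataclass(frozen=True)
class Snapshot:
    """A test target saved locally, so later changes to the live page
    cannot alter the experiment."""
    url: str
    html: str
    gold_text: str  # manually labelled main content (assumed available)


def token_f1(extracted: str, gold: str) -> float:
    """Token-level F1 over whitespace-separated words (bag-of-words overlap)."""
    ext, ref = Counter(extracted.split()), Counter(gold.split())
    overlap = sum((ext & ref).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(ext.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def exact_match(extracted: str, gold: str) -> float:
    """1.0 when the token sequences are identical, otherwise 0.0."""
    return 1.0 if extracted.split() == gold.split() else 0.0


def evaluate(
    extractors: Dict[str, Extractor],
    measures: Dict[str, Measure],
    snapshots: List[Snapshot],
) -> Dict[str, Dict[str, float]]:
    """Run every extractor on every saved snapshot and average each measure."""
    if not snapshots:
        raise ValueError("at least one snapshot is required")
    results: Dict[str, Dict[str, float]] = {}
    for name, extract in extractors.items():
        pairs = [(extract(s.html), s.gold_text) for s in snapshots]
        results[name] = {
            m_name: sum(measure(out, gold) for out, gold in pairs) / len(pairs)
            for m_name, measure in measures.items()
        }
    return results


# Two toy extractors, for illustration only (regex parsing of HTML is not robust).
def strip_tags(html: str) -> str:
    """Baseline: keep all text on the page, including navigation."""
    return " ".join(re.sub(r"<[^>]+>", " ", html).split())


def article_only(html: str) -> str:
    """Keep only the text inside the first <article> element, if any."""
    match = re.search(r"<article>(.*?)</article>", html, re.S)
    return strip_tags(match.group(1)) if match else ""
```

    Under these assumptions, a new algorithm or a new comparison scale is added by registering one more entry in the extractors or measures dictionary, and every run reads the same saved HTML rather than the live page.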


 
